Monte Carlo Methods in PageRank Computation: When One Iteration is Sufficient

Authors

  • Konstantin Avrachenkov
  • Nelly Litvak
  • Danil Nemirovsky
  • Natalia Osipova
Abstract

PageRank is one of the principal criteria according to which Google ranks Web pages. PageRank can be interpreted as the frequency with which a random surfer visits a Web page, and thus it reflects the popularity of a Web page. Google computes the PageRank using the power iteration method, which requires about one week of intensive computations. In the present work we propose and analyze Monte Carlo type methods for the PageRank computation. The probabilistic Monte Carlo methods have several advantages over the deterministic power iteration method: they provide a good estimate of the PageRank for relatively important pages already after one iteration; they have a natural parallel implementation; and finally, they allow continuous updating of the PageRank as the structure of the Web changes.
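The contrast drawn in the abstract can be illustrated with a minimal sketch. It is not necessarily the authors' exact algorithm: it assumes a damping factor `c = 0.85`, counts every visit along each random walk, and sends a walk that reaches a page without outgoing links to a uniformly random page. The function names, the `links` dictionary format, and the `runs_per_page` parameter are hypothetical and chosen only for illustration. Setting `runs_per_page=1` corresponds to a single pass in which one walk is started from every page.

```python
import random


def monte_carlo_pagerank(links, c=0.85, runs_per_page=1, seed=0):
    """Estimate PageRank from visit counts of terminating random walks.

    links: dict mapping each page to the list of pages it links to.
    c: damping factor (assumed); a walk continues with probability c.
    runs_per_page: number of walks started from every page.
    """
    rng = random.Random(seed)
    pages = list(links)
    visits = {p: 0 for p in pages}
    for _ in range(runs_per_page):
        for start in pages:
            page = start
            while True:
                visits[page] += 1
                if rng.random() >= c:  # walk stops with probability 1 - c
                    break
                out = links[page]
                # Assumption: a dangling page jumps to a uniformly random page.
                page = rng.choice(out) if out else rng.choice(pages)
    total = sum(visits.values())
    return {p: visits[p] / total for p in pages}


def power_iteration_pagerank(links, c=0.85, iterations=100):
    """Deterministic baseline: repeat rank <- c * rank * P + (1 - c) / n."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new = {p: (1.0 - c) / n for p in pages}
        for p in pages:
            # Same dangling-page assumption as in the Monte Carlo sketch.
            targets = links[p] if links[p] else pages
            share = c * rank[p] / len(targets)
            for q in targets:
                new[q] += share
        rank = new
    return rank


if __name__ == "__main__":
    toy_web = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
    print(monte_carlo_pagerank(toy_web, runs_per_page=1))
    print(power_iteration_pagerank(toy_web))
```

Each walk in the sketch depends only on its own random draws, so the walks could be distributed across workers and their visit counts summed afterwards. On a four-page toy graph a single pass is noisy; the estimate tightens as `runs_per_page` grows.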

Similar resources

PageRank algorithm and Monte Carlo methods in PageRank Computation

PageRank is the algorithm used by the Google search engine for ranking web pages. PageRank Algorithm calculates for each page a relative importance score which can be interpreted as the frequency of how often a page is visited by a surfer. The purpose of this work is to provide a mathematical analysis of the PageRank Algorithm. We analyze the random surfer model and the linear algebra behind it...

Monte Carlo Methods of PageRank Computation

We describe and analyze an on-line Monte Carlo method of PageRank computation. The PageRank is estimated based on the results of a large number of short independent simulation runs initiated from each page that contains outgoing hyperlinks. The method does not require any storage of the hyperlink matrix and is highly parallelizable. We study confidence intervals, and discover drawbacks of th...

Fast Incremental and Personalized PageRank

In this paper, we analyze the efficiency of Monte Carlo methods for incremental computation of PageRank, personalized PageRank, and similar random walk based methods (with focus on SALSA), on large-scale dynamically evolving social networks. We assume that the graph of friendships is stored in distributed shared memory, as is the case for large social networks such as Twitter. For global PageRa...

Stochastic Assessment of Voltage Sags in Distribution Networks

This paper compares the fault position and Monte Carlo methods, the most common methods for stochastic assessment of voltage sags. To compare their abilities, symmetrical and unsymmetrical faults with different probability distributions of fault positions along the lines are applied in a test system. The voltage sag magnitude at different nodes of the test system is calculated. The problem with the...

Fast Bidirectional Probability Estimation in Markov Models

We develop a new bidirectional algorithm for estimating Markov chain multi-step transition probabilities: given a Markov chain, we want to estimate the probability of hitting a given target state in ℓ steps after starting from a given source distribution. Given the target state t, we use a (reverse) local power iteration to construct an ‘expanded target distribution’, which has the same mean as...

Journal title:
  • SIAM J. Numerical Analysis

Volume 45

Publication date: 2007